- Hide4PGP v1.0 - first public version Dec 20, 1996
-
- copyrighted software by Heinz Repp, Germany
- email: 100434.2106@compuserve.com
-
- Contents:
-
- 1 - Introduction
-
- 2 - Commandline Parameters
-
- 3 - Technical
-
- 4 - Source Code
-
- 5 - Other Sources
-
- 6 - Legal Stuff
-
-
-
-
- 1 - Introduction
-
- The idea for this program came from the public discussion about
- limiting or banning cryptography for the public, as in France and,
- soon perhaps, in the US and Germany too. One way out is to make
- encrypted data invisible by hiding them in other files - this is
- called steganography. As encrypted data themselves already resemble
- pure (white) noise they should only add some extra noise to other
- data. Henry Hastur was one of the first to add a little program
- (Stealth) to the PGP distribution (Pretty Good Privacy by Phil
- Zimmermann) to remove (and later add) the PGP header leaving pure
- white noise. I have used PGP for years, so I had PGP and Stealth in
- mind when planning this program. Stealth acts like a UNIX filter, so
- my program should likewise receive or send the encrypted data from/to
- a pipe. It also works with I/O redirection ('<' and '>' on the
- commandline). Unlike other available steganography programs it has
- no encryption of its own. I think that PGP itself is unbreakable, so
- additional encryption doesn't make it any stronger. On the contrary:
- adding a separate cipher adds a header that can be traced. Just think
- of extracting the 1, 2, ... least significant bits of every value and
- looking there for known header structures or coded end of file marks.
- PGP files treated with Stealth have no such weakness.
-
- I also had a look at White Noise Storm (tm) by Ray Arachelian and
- Steganos v1.4 by Fabian Hansmann. Steganos came closest to what I
- wanted, and the choice of VOC, WAV and BMP files for steganography was
- inspired by it. Both programs do a good job with real quantitative
- data, that is, true color bitmaps. Greyscale pictures sometimes
- caused problems, and 256-color palette BMPs usually showed dramatic
- changes.
-
- What is the reason? Pixels in 8 bit bitmaps don't hold a real color
- but an index into a color table, the palette. Changing the least
- significant bit of real quantitative data makes a difference of 1 in
- a range of 255, less than 0.4%; with 2 bits it's 3 out of 255 or
- 1.2% - still hard to recognize. With palettes, a difference of 1 bit
- means a different palette entry, and this can be a completely
- different color. Greyscale pictures also have a palette. Often this
- palette contains grey 'colors' in strictly ascending order of
- brightness, making the palette indices resemble true grey values. But
- sometimes the palette of a greyscale picture is arbitrarily mixed as
- well.
-
- To make small changes in palette indices correspond to small color
- changes, the colors at neighbouring palette positions must be as
- similar as possible, or identical. There are two ways to achieve
- this, and they can be combined. First, many bitmaps do not use all
- available palette colors or positions, so one can DUPLICATE as many
- entries as possible - preferably the ones most frequently used. If
- the duplicates share the same 7 higher bits, the content of the last
- bit is arbitrary: the color of the pixel stays the same. With
- quadruplicated entries one can change the last two bits without any
- change in the pixel's appearance, and so on. The second way is to
- rearrange the palette entries so that similar colors are PAIRED
- together. This task is not trivial, because trying all possibilities
- is far beyond capacity, so some kind of heuristic procedure must do
- the job.
-
- Last but not least, I wanted to be able to use more than just one
- bit: up to 8 with 16 bit data (still less than 0.4%!) should be
- possible.
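-
- As a minimal sketch of the arithmetic above (not taken from the
- Hide4PGP sources; all function names are made up for illustration),
- replacing the lowest bits of a sample and reading them back could
- look like this in C:
-

```c
#include <assert.h>

/* Overwrite the `bits` lowest bits of one sample with `payload`. */
unsigned embed_bits(unsigned sample, unsigned payload, int bits)
{
    unsigned mask;

    assert(bits >= 1 && bits <= 8);
    mask = (1u << bits) - 1u;
    return (sample & ~mask) | (payload & mask);
}

/* Read the `bits` lowest bits back out of a sample. */
unsigned extract_bits(unsigned sample, int bits)
{
    assert(bits >= 1 && bits <= 8);
    return sample & ((1u << bits) - 1u);
}

/* Largest possible change to a sample, in percent of its full range
   (255 for 8 bit data, 65535 for 16 bit data). */
double max_change_percent(int bits, unsigned full_range)
{
    assert(bits >= 1 && bits <= 8 && full_range > 0);
    return 100.0 * (double)((1u << bits) - 1u) / (double)full_range;
}
```

-
- max_change_percent(1, 255) and max_change_percent(8, 65535) both stay
- below 0.4, which is the bound quoted in the text.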
-
- Those were the requirements the program had to meet, and this is
- what came of it: a program that handles OS/2 1.x, Windows 3.x or
- OS/2 2.x uncompressed single image bitmaps with 256 or 16.7 million
- colors, WAV files with 8, 12 or 16 bit uncompressed mono or stereo
- sound samples, and 8 bit VOC files. Among graphics formats I decided
- to support only bitmaps because nearly every graphics program can
- read and write them, and it is easy to convert a bitmap into another
- graphics format after the steganographic process. The only
- prerequisite is that the palette must be preserved and that any
- compression used must be lossless, so JPEG doesn't work, but GIF,
- TIFF, ... do. So most of the pictures used e.g. on the WWW / Internet
- can be used for steganography with this program. Use the GIFs of
- your homepage ...
-
-
- 2 - Commandline Parameters
-
- The syntax is simple: Hide4PGP always needs a file to work on (BMP,
- WAV, or VOC) and optionally some switches. The file need not have
- the proper extension: Hide4PGP recognizes the format by the header,
- not by the name. Secret data to be stored in the file is read from
- the standard input, which is usually the keyboard. So you may want
- to specify a pipe ('|' symbol) to use the output of another program,
- or redirect a file (with the '<' symbol). Secret data extracted from
- a file is sent to the standard output, which is normally the screen.
- Again you may use a pipe or redirect the output to a file ('>'
- symbol). This works ideally with PGP and Stealth in series, e.g.
-
- pgp -ef user-ID < file-2-hide | stealth | Hide4PGP datafile
-
- encrypts the file <file-2-hide> for the receiver <user-ID>, then
- removes the header information leaving pure pseudo random data, and
- hides them in the <datafile>. Alternatively you may specify
-
- Hide4PGP datafile < secret.dat
-
- to hide secret.dat in datafile.
-
- Hide4PGP -x datafile | stealth -a user-ID | pgp -f > decrypted-file
-
- extracts the data previously stored in <datafile>, then adds a header
- for <user-ID>, decrypts the file and stores the plaintext data in
- <decrypted-file>. Remember that you must first set your passphrase in
- PGPPASS when using PGP for decryption in filter mode (or use the -z
- switch)! Alternatively,
-
- Hide4PGP -x datafile > secret.dat
-
- extracts the data to secret.dat.
-
- Options or switches may be anywhere on the commandline, starting with
- '-' or '/'. Options may be in any order and case, and separate or
- combined. Legal options are:
-
- x eXtract; the default action is hiding.
-
- 1,2,4 or 8 means the number of least significant bits to be used;
- without this parameter the program defaults to 1 bit
- with 8 bit data and 4 bits with 16 bit data. 8 bits are
- only allowed with 16 bit data - this should be clear.
- When extracting you must specify the same parameter as
- when the data was stored - it is not recorded. (If you
- forgot or don't know: just try the few possibilities.)
-
- d[=] Duplicate palette entries: unused palette entries are
- replaced with duplicates of the most frequently used
- ones. The number of copies of one entry is always a
- power of 2; this allows the use of the corresponding
- number of least significant bits without effect on the
- pixels of this color.
- If 'd=' is specified then all entries are treated
- equally if fewer than 128 colors are used, i.e. all
- colors appear 2, 4, 8 or ... times. In this case
- probably not all palette positions will be used. I
- have no good idea what this can be for, maybe to spare
- some palette positions for the system palette.
- The -d option works only with 256-color bitmaps.
-
- p[+] Pair similar palette entries: Hide4PGP offers two
- different approaches, one quick single pass and one
- iterative procedure. The results are in most
- situations very similar, so the default (p alone) is
- the quicker first method; 'p+' selects the iterative
- one. The latter may sometimes take unpredictably
- longer, up to one or two minutes, because it can reach
- stable intermediate states from which it tries to
- escape with the help of random numbers. It will
- escape, but occasionally this takes some time. For a
- given bitmap, try both to see which gives better
- results. The -p option also works only with 256-color
- bitmaps.
-
- v[-] Verbose: gives detailed information about the data
- file and the steganographic process. 'v-' suppresses
- all output except messages about fatal errors that
- lead to program abort (UNIX style: say nothing as
- long as everything is o.k.).
-
- h or ? Help; if Hide4PGP finds this option, it shows a help
- page with a short description of all parameters and
- terminates.
-
-
- Hint: It is advantageous to combine the d and p switches. Many 256
- color bitmaps can be used with the d and p (or p+) switches combined,
- even at 2 bits per pixel, with little if any change to the picture!
-
-
- 3 - Technical
-
- Hide4PGP works only on real data in the data file and leaves unused
- space untouched. This allows the data file to be converted into
- other picture or sound formats that leave the data intact.
-
- Usually the secret data don't fill the whole data file - after them
- random data is inserted so that the modified part of the file cannot
- easily be told apart from an unmodified rest. There is no end of file
- mark stored with the secret data stream, so on extraction the least
- significant bits of the whole file are always extracted. The
- cryptographic software must recognize the end of the encrypted data
- by itself - as PGP does with an encrypted end of file sign. If the
- secret data cannot be stored completely, a warning is given. The part
- that has been stored can still be extracted, but some encryption
- software like PGP will not decrypt it at all. As a rule of thumb,
- using 1 bit requires a datafile about 8 times the size of the secret
- data, 4 times with 2 bits, 2 times with 4. Double the requirement
- with 16 bit data.
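-
- The rule of thumb can be written down as a small helper. This is
- only an illustrative sketch: carrier_bytes_needed is a hypothetical
- name, and file headers, palettes and other non-sample data are not
- counted.
-

```c
#include <assert.h>

/* Bytes of sample data needed to carry `secret_bytes` of hidden data
   when `bits` low bits are used in every sample of `sample_bytes`
   bytes (1 for 8 bit data, 2 for 16 bit data). */
unsigned long carrier_bytes_needed(unsigned long secret_bytes,
                                   int bits, int sample_bytes)
{
    unsigned long samples;

    assert(bits == 1 || bits == 2 || bits == 4 || bits == 8);
    assert(sample_bytes == 1 || sample_bytes == 2);
    assert(bits < 8 || sample_bytes == 2);   /* 8 bits need 16 bit data */

    /* One sample carries `bits` bits; round up to whole samples. */
    samples = (secret_bytes * 8UL + (unsigned long)bits - 1UL)
              / (unsigned long)bits;
    return samples * (unsigned long)sample_bytes;
}
```

-
- For 1000 bytes of secret data this gives 8000, 4000 and 2000 bytes
- for 1, 2 and 4 bits with 8 bit data, and twice that with 16 bit data.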
-
- 256-color bitmaps often use only 100 colors or fewer. The
- duplication routine sorts the palette entries according to their
- frequency, then duplicates them in order, more frequently used first.
- If there is room left, the entries are quadruplicated, and so on.
- Sometimes it may only be necessary to duplicate all entries, as this
- allows using the least significant bit without altering the picture,
- or to take every entry 16 times because the bitmap has been converted
- from a 16 color format. The '=' suboption uses an abbreviated code
- for this purpose, but there is no disadvantage in leaving it off.
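-
- One possible reading of this scheme, as a sketch only: the actual
- routine in Hide4PGP may differ in its details. plan_copies and
- assign_slots are hypothetical names, and the used colors are assumed
- to be sorted already, most frequent first.
-

```c
#include <assert.h>

#define PALETTE_SIZE 256

/* Decide how many copies each of the n used colours gets. Colours are
   doubled in order of frequency, round after round, as long as the
   result still fits into the palette. Returns the slots occupied. */
int plan_copies(int n, int copies[])
{
    int i, total, grew;

    assert(n >= 1 && n <= PALETTE_SIZE);
    for (i = 0; i < n; i++)
        copies[i] = 1;
    total = n;
    do {
        grew = 0;
        for (i = 0; i < n; i++) {
            if (total + copies[i] > PALETTE_SIZE)
                break;                /* no room to double this one */
            total += copies[i];
            copies[i] *= 2;
            grew = 1;
        }
    } while (grew);
    return total;
}

/* Lay the copies out one after another. Because the copy counts are
   powers of 2 and never increase along the list, every group starts
   at a multiple of its own size, so its members differ only in their
   lowest bits. */
void assign_slots(int n, const int copies[], int first_slot[])
{
    int i, next = 0;

    assert(n >= 1 && n <= PALETTE_SIZE);
    for (i = 0; i < n; i++) {
        first_slot[i] = next;
        next += copies[i];
    }
}
```

-
- With 100 used colors this sketch gives the 28 most frequent colors 4
- copies each and the remaining 72 colors 2 copies each, filling all
- 256 positions. With 16 used colors every color gets 16 copies.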
-
- One word of warning: using palette entry duplication allows
- sophisticated software to detect altered bitmaps because these
- palettes have an unusual structure. But those changes can only be
- seen when looking at the palette information - the casual viewer will
- not recognize any difference.
-
- The two methods for pairing the entries of a 256 color palette need a
- measure of similarity. As distances in the RGB colorspace do not
- correlate very well with human perception, I decided to convert the
- RGB coordinates into CIE L*a*b* coordinates, whose distances give a
- much better correlation. Conversion formulas for the D65 whitepoint
- of the ITU 601-1 standard are used. Most pictures are viewed on a
- monitor, and monitors usually have no linear response, i.e. the
- brightness function is more a parabola than a straight line. That is
- why I treat the palette RGB coordinates as if they were precorrected
- for a gamma value of 2.0, which I feel comes close to the typical
- monitor, and as if the colors we see on the monitor were the real
- colors.
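-
- A floating point sketch of such a conversion follows. It is not the
- integer code used by Hide4PGP. The RGB -> XYZ matrix (Rec. 709
- primaries with a D65 white) is an assumption chosen for illustration
- and may differ from the coefficients in the program; the names
- rgb_to_lab and lab_dist2 are hypothetical. The cube root is computed
- by Newton's method so that no math library is needed.
-

```c
#include <assert.h>

typedef struct { double L, a, b; } Lab;

/* Cube root for 0 < t <= 1 by Newton's method, starting from 1. */
static double cube_root(double t)
{
    double x = 1.0;
    int i;

    for (i = 0; i < 40; i++)
        x = (2.0 * x + t / (x * x)) / 3.0;
    return x;
}

/* Helper function f() of the CIE L*a*b* definition. */
static double lab_f(double t)
{
    return (t > 0.008856) ? cube_root(t) : 7.787 * t + 16.0 / 116.0;
}

/* Convert 8 bit RGB to L*a*b*, treating the values as precorrected
   for a gamma of 2.0 (displayed intensity = square of stored value). */
Lab rgb_to_lab(int r, int g, int b)
{
    /* Assumed RGB -> XYZ matrix; not taken from the Hide4PGP source. */
    static const double M[3][3] = {
        { 0.4124, 0.3576, 0.1805 },
        { 0.2126, 0.7152, 0.0722 },
        { 0.0193, 0.1192, 0.9505 }
    };
    double lin[3], f[3];
    Lab out;
    int i;

    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    lin[0] = (r / 255.0) * (r / 255.0);
    lin[1] = (g / 255.0) * (g / 255.0);
    lin[2] = (b / 255.0) * (b / 255.0);
    for (i = 0; i < 3; i++) {
        double white = M[i][0] + M[i][1] + M[i][2];  /* XYZ of RGB white */
        double v = M[i][0] * lin[0] + M[i][1] * lin[1] + M[i][2] * lin[2];
        f[i] = lab_f(v / white);
    }
    out.L = 116.0 * f[1] - 16.0;
    out.a = 500.0 * (f[0] - f[1]);
    out.b = 200.0 * (f[1] - f[2]);
    return out;
}

/* Squared distance between two colours in L*a*b* space. */
double lab_dist2(Lab p, Lab q)
{
    double dL = p.L - q.L, da = p.a - q.a, db = p.b - q.b;
    return dL * dL + da * da + db * db;
}
```

-
- lab_dist2 applied to two converted palette entries then serves as the
- measure of similarity: the smaller the value, the more alike the two
- colors look.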
-
- The pairing methods can be characterized as follows: The first one
- uses a heuristic approach, pairing the palette entries one after
- another, always giving preference to the entry with the maximum
- distance to its second nearest neighbour, and pairing it with its
- nearest. This gives optimum or nearly optimum pairs in most cases,
- but sometimes pairs with exceptionally large distances are built.
- Those pairs are repeatedly split by exchanging partners with another
- pair that is 'in the space between' and that minimizes the sum of
- squares of intra-pair distances. This algorithm was designed somewhat
- arbitrarily but proved to produce reliable results.
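-
- The first step of this heuristic might be sketched as follows. The
- later repair step that splits far-apart pairs is left out, and the
- function name and the distance matrix layout are assumptions made
- for this illustration, not taken from the Hide4PGP sources.
-

```c
#include <assert.h>
#include <float.h>

/* Greedy pairing. dist is an n*n row-major matrix of (squared) colour
   distances, n must be even. On return partner[i] is the index of the
   entry paired with i. */
void pair_greedy(int n, const double *dist, int partner[])
{
    int i, j, left;

    assert(n >= 2 && n % 2 == 0);
    for (i = 0; i < n; i++)
        partner[i] = -1;

    for (left = n; left > 0; left -= 2) {
        int pick = -1, pick_nearest = -1;
        double pick_second = -1.0;

        for (i = 0; i < n; i++) {
            int nearest = -1;
            double d1 = DBL_MAX, d2 = DBL_MAX;  /* nearest, 2nd nearest */

            if (partner[i] >= 0)
                continue;                       /* already paired */
            for (j = 0; j < n; j++) {
                double d;

                if (j == i || partner[j] >= 0)
                    continue;
                d = dist[i * n + j];
                if (d < d1) {
                    d2 = d1;
                    d1 = d;
                    nearest = j;
                } else if (d < d2) {
                    d2 = d;
                }
            }
            /* Prefer the entry whose second nearest neighbour is
               farthest away: it has most to lose if its nearest
               neighbour is taken by someone else. */
            if (d2 > pick_second) {
                pick = i;
                pick_nearest = nearest;
                pick_second = d2;
            }
        }
        partner[pick] = pick_nearest;
        partner[pick_nearest] = pick;
    }
}
```

-
- For four grey values at 0, 1, 10 and 12 the entry at 12 is served
- first (its second nearest neighbour is farthest away) and paired
- with 10; the remaining two entries then form the second pair.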
-
- The second method constructs a matrix of the reciprocals of the
- squared distances between every palette entry and every other, with
- 0's in the main diagonal. In every iteration cycle every element of
- the matrix is replaced by its square divided by its row and column
- sums. This tends to leave only one element in every row and column
- with value 1, preferably the one representing palette entries with
- minimal distance, and all other elements with 0. The palette entries
- corresponding to the rows and columns of the 1 elements are then
- paired together. Occasionally this algorithm may produce empty rows
- or columns, especially with palettes with a large share of identical
- entries. Then the program aborts with a division by zero error. Care
- has been taken to avoid such a situation, but there's no guarantee.
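-
- One iteration cycle could look like the sketch below. It assumes
- that 'divided by row and column sums' means division by the product
- of the element's row sum and column sum; this is an interpretation
- of the description, not the verified code, and it uses floating
- point where the program uses integers. iterate_once is a
- hypothetical name.
-

```c
#include <assert.h>

#define N_MAX 256

/* One cycle: a[i][j] <- a[i][j]^2 / (rowsum[i] * colsum[j]).
   a is an n*n row-major matrix. Returns 0 on success, or -1 if an
   empty row or column would cause a division by zero. */
int iterate_once(int n, double *a)
{
    double row[N_MAX], col[N_MAX];
    int i, j;

    assert(n >= 1 && n <= N_MAX);
    for (i = 0; i < n; i++)
        row[i] = col[i] = 0.0;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            row[i] += a[i * n + j];
            col[j] += a[i * n + j];
        }
    }
    for (i = 0; i < n; i++) {
        if (row[i] == 0.0 || col[i] == 0.0)
            return -1;                  /* empty row or column */
    }
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double v = a[i * n + j];
            a[i * n + j] = v * v / (row[i] * col[j]);
        }
    }
    return 0;
}
```

-
- Under this reading a matrix that already consists of isolated pairs
- is a fixed point: an element that is alone in its row and column is
- replaced by its square divided by its square, i.e. by 1.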
-
- After having grouped all palette entries into pairs, an average color
- of every pair is calculated, and then those pairs are again paired
- into pairs of pairs representing the next higher bit. This procedure
- is repeated until the palette is completely represented by a binary
- tree. The palette is then rearranged in the order of this binary tree
- and all pixels (= indices) changed to reflect the new order.
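-
- The final rearrangement step can be illustrated like this, assuming
- the new order is already known. The Rgb type, the function name and
- the meaning of the order array are made up for this sketch.
-

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned char r, g, b; } Rgb;

/* Rearrange an n-entry palette and renumber the pixels to match.
   order[k] is the OLD index of the colour that moves to position k;
   it must be a permutation of 0 .. n-1. */
void reorder_palette(int n, const int order[], Rgb palette[],
                     unsigned char *pixels, size_t npixels)
{
    Rgb old[256];
    unsigned char new_index[256];
    size_t p;
    int k;

    assert(n >= 1 && n <= 256);
    for (k = 0; k < 256; k++)
        new_index[k] = (unsigned char)k;   /* unused indices stay put */
    for (k = 0; k < n; k++)
        old[k] = palette[k];
    for (k = 0; k < n; k++) {
        assert(order[k] >= 0 && order[k] < n);
        palette[k] = old[order[k]];
        new_index[order[k]] = (unsigned char)k;
    }
    /* Every pixel keeps its colour but gets the colour's new index. */
    for (p = 0; p < npixels; p++)
        pixels[p] = new_index[pixels[p]];
}
```

-
- After the call every pixel still shows the same color as before;
- only the numbering has changed, so that neighbouring indices now
- refer to similar colors.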
-
- All calculations are done with integer arithmetic and are therefore
- pretty fast! The executable has been compiled using the Microsoft
- Quick C 2.5 Compiler in the compact model (code near, data far).
-
-
- 4 - Source Code
-
- The program is written in ANSI C and should therefore easily be ported
- to other systems. I used standard ANSI calls wherever applicable to
- encourage those ports. There are some exceptions to notice:
-
- I made the general assumption that int is 16 bit and long is 32 bit.
- If this is not the case with your compiler then a lot of work will
- have to be done.
-
- The macros GETWORD and PUTWORD are used to read or write 16 bit words
- lower byte first. They are set to 'fgetw' and 'fputw' in
- HIDE4PGP.H. If your system doesn't have these, you might have to add
- your own routines and change the macros. They must take exactly the
- same parameters as fgetc or fputc.
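-
- A portable replacement pair might look like the following sketch.
- The names get_word_le and put_word_le are hypothetical; only fgetc
- and fputc from the standard library are used.
-

```c
#include <assert.h>
#include <stdio.h>

/* Read a 16 bit word, lower byte first; same parameter as fgetc.
   Returns EOF on error or end of file. Where int has only 16 bits
   the word 0xFFFF cannot be told apart from EOF by its value, so
   check feof() or ferror() in that case. */
int get_word_le(FILE *fp)
{
    int lo, hi;

    assert(fp != NULL);
    lo = fgetc(fp);
    if (lo == EOF)
        return EOF;
    hi = fgetc(fp);
    if (hi == EOF)
        return EOF;
    return (int)((unsigned)lo | ((unsigned)hi << 8));
}

/* Write a 16 bit word, lower byte first; same parameters as fputc.
   Returns the word written, or EOF on error. */
int put_word_le(int w, FILE *fp)
{
    assert(fp != NULL);
    if (fputc(w & 0xFF, fp) == EOF)
        return EOF;
    if (fputc((w >> 8) & 0xFF, fp) == EOF)
        return EOF;
    return w & 0xFFFF;
}
```

-
- With these in place the two macros could be pointed at get_word_le
- and put_word_le instead of 'fgetw' and 'fputw'.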
-
- The main module HIDE4PGP.C contains a 'setmode' call to set
- stdin/stdout to binary and therefore #includes 'io.h'. As this is not
- standard ANSI C, you might have to change it.
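-
- One way to isolate this call is a small wrapper that does nothing on
- systems without text/binary distinction. This is a sketch under
- assumptions: the platform test macros shown here (_WIN32, __MSDOS__)
- depend on the compiler and may need adjusting, and set_binary is a
- hypothetical name.
-

```c
#include <assert.h>
#include <stdio.h>

#if defined(_WIN32) || defined(__MSDOS__)
#include <fcntl.h>
#include <io.h>
#define HAVE_SETMODE 1
#endif

/* Switch a stream to binary mode where the system distinguishes text
   and binary streams. Returns 0 on success, -1 on failure. */
int set_binary(FILE *fp)
{
    assert(fp != NULL);
#ifdef HAVE_SETMODE
    return (setmode(fileno(fp), O_BINARY) == -1) ? -1 : 0;
#else
    (void)fp;       /* nothing to do: streams are always binary here */
    return 0;
#endif
}
```

-
- The main module would then call set_binary(stdin) and
- set_binary(stdout) once at startup, and 'io.h' is only included
- where it exists.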
-
- If your system generally writes or reads 16 or 32 bit values in the
- order MSB first, you will also have to change all occurrences of
- 'fread' and 'fwrite' with parameter sizeof (int) or sizeof (long).
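-
- Such calls can be made independent of the machine's byte order by
- transferring single bytes and assembling the value by hand. A sketch
- for 32 bit values, with hypothetical function names:
-

```c
#include <assert.h>
#include <stdio.h>

/* Read a 32 bit value stored lowest byte first, whatever the byte
   order of the machine. Returns 0 on success, -1 on a short read. */
int read_dword_le(FILE *fp, unsigned long *out)
{
    unsigned char b[4];

    assert(fp != NULL && out != NULL);
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    *out = (unsigned long)b[0]
         | ((unsigned long)b[1] << 8)
         | ((unsigned long)b[2] << 16)
         | ((unsigned long)b[3] << 24);
    return 0;
}

/* Write a 32 bit value lowest byte first. Returns 0 on success,
   -1 on a write error. */
int write_dword_le(unsigned long v, FILE *fp)
{
    unsigned char b[4];

    assert(fp != NULL);
    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
    return (fwrite(b, 1, 4, fp) == 4) ? 0 : -1;
}
```

-
- The same pattern with two bytes covers the 16 bit case.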
-
- The Makefile provided is an MS Quick C one and of little use for
- other compilers. Generally speaking, you need to compile all *.c
- files together with the *.h file to *.o or similar and link them into
- one executable Hide4PGP.EXE.
-
- If you have successfully ported this program to another computer
- system, please let me know. My email address is 100434,2106 at
- CompuServe or 100434.2106@compuserve.com via Internet. This is also
- the address for bug reports, recommendations or improvements!
-
-
- 5 - Other sources
-
- I found information about the structure of WAV and VOC files in an
- article by Kai Schwirtzke in c't 1/93, p. 213. The BMP Windows format
- was described in a file named BMP.DOC in one of the CompuServe
- graphics forums - thanks to the unknown author. I found information
- about the OS/2 BMP structures in the OS/2 header files of the
- compiler - and by looking at the BMP files themselves. I found
- information about the CIE colorspaces and conversion formulas in the
- Color spaces FAQ by David Bourgin
- (the conversion XYZ -> Lab for L* is wrong!) and an article by
- Michael Haas and Todd Newman of the International Color Consortium,
- 'Color Management: Current Practice and The Adoption of a New
- Standard'. Some of the algorithms are modified from Robert
- Sedgewick's 'Algorithms.'
-
-
-
- 6 - Legal Stuff
-
- You may freely use this software in any environment. You may make
- copies and freely distribute this software provided the archive
- contains all files of the original distribution and is not modified in
- any way. You may distribute it in any Freeware / Shareware / PD
- collection provided the charge for the data carrier is only nominal
- and covers only the media and copying costs. You may not sell this
- program or include it into a for profit product without written
- consent of the author. You may compile the source on any other
- computer system. You may change the source code and distribute the
- modified version only if you clearly state all differences, leave the
- copyright notice intact and inform the author.
-
- THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND.
- UNDER NO CIRCUMSTANCES SHALL THE AUTHOR BE LIABLE FOR ANY INCIDENTAL,
- SPECIAL OR CONSEQUENTIAL DAMAGES (INCLUDING DAMAGES FOR LOSS OF
- BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION
- AND THE LIKE) ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE
- OR ITS DOCUMENTATION.
-